Sequence Collection

• Sequences and annotations are collected from a variety of sources.

Sequence Analysis 

• Sequences are aligned to the draft assembly of the genome.

• Low-quality bases are trimmed from ESTs.

• Polyadenylation sites are detected.
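
The trimming step above could be sketched as follows. This is a minimal illustration only: the function name `trim_low_quality`, the per-base quality list, and the threshold `min_q=20` are assumptions made for the example and are not taken from the flowchart, which does not state how low-quality bases are identified.

```python
def trim_low_quality(seq, quals, min_q=20):
    """Trim bases from both ends of an EST until a base with quality >= min_q.

    seq    -- EST bases as a string
    quals  -- one quality score per base (hypothetical input format)
    min_q  -- assumed threshold; the source gives no value
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lengths differ")
    start, end = 0, len(seq)
    # Walk inward from the 5' end past low-quality bases.
    while start < end and quals[start] < min_q:
        start += 1
    # Walk inward from the 3' end past low-quality bases.
    while end > start and quals[end - 1] < min_q:
        end -= 1
    return seq[start:end]
```

Only the ends are trimmed in this sketch; low-quality bases in the interior of the read are left in place.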

Seed Cluster Creation 

• Seed clusters created for each UniGene Cluster. 

• Additional seed clusters are created with potential full-length 
mRNAs not in UniGene. 

Genomic Based Subclustering

• Seed clusters are subclustered based on the contig. 
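
One possible reading of contig-based subclustering is a simple grouping of the members of a seed cluster by the contig to which each member aligned. The sketch below assumes a hypothetical mapping from sequence ID to contig name; the names `subcluster_by_contig` and `alignments` are illustrative and do not appear in the source.

```python
from collections import defaultdict


def subcluster_by_contig(seed_cluster, alignments):
    """Split one seed cluster into subclusters keyed by contig.

    seed_cluster -- iterable of sequence IDs in the seed cluster
    alignments   -- dict mapping sequence ID -> contig name (assumed format)
    Sequences with no alignment are grouped under the key None.
    """
    groups = defaultdict(list)
    for seq_id in seed_cluster:
        contig = alignments.get(seq_id)  # None when the sequence is unaligned
        groups[contig].append(seq_id)
    return dict(groups)
```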

Sequence Based Subclustering

• Sequences are subclustered in transcriptome space. 

Orientation Based Subclustering

• EST read directions, CDS annotations, consensus splice sites,
and polyadenylation sites are used to predict the orientation of
the subcluster. The subcluster is resubclustered if substantial
conflicts exist.
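
The orientation step can be pictured as a vote across the lines of evidence. The sketch below is an assumption-laden illustration: each piece of evidence is reduced to a '+' or '-' vote, and "substantial conflicts" is modeled as the minority share exceeding a hypothetical `max_conflict` fraction. The source does not specify a voting rule or a threshold.

```python
from collections import Counter


def predict_orientation(votes, max_conflict=0.2):
    """Return (orientation, conflicted) for one subcluster.

    votes        -- list of '+' or '-' strings, one per piece of evidence
                    (EST read direction, CDS annotation, splice site,
                    polyadenylation site); the encoding is assumed
    max_conflict -- assumed fraction of minority votes above which the
                    subcluster would be flagged for resubclustering
    """
    counts = Counter(votes)
    plus, minus = counts["+"], counts["-"]
    total = plus + minus
    if total == 0:
        return None, False  # no usable evidence
    orientation = "+" if plus >= minus else "-"
    conflicted = min(plus, minus) / total > max_conflict
    return orientation, conflicted
```

A caller would resubcluster only when the second element of the result is true.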

Picking Probe Selection Regions 

• For each subcluster, regions for probe selection are selected 
from either the consensus or exemplar sequences. 

• Multiple regions may be chosen due to alternative polyadenylation sites.
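
To illustrate how alternative polyadenylation sites could yield more than one region, the sketch below takes a window immediately upstream of each site on a consensus or exemplar sequence. The function name `pick_regions` and the window length `region_len=600` are placeholders chosen for the example; the flowchart does not state how region boundaries are set.

```python
def pick_regions(seq_len, polya_sites, region_len=600):
    """Return one (start, end) region upstream of each polyadenylation site.

    seq_len     -- length of the consensus or exemplar sequence
    polya_sites -- positions (1-based end coordinates) of detected sites
    region_len  -- assumed window length; not given in the source
    Coordinates are half-open: bases start..end-1 in 0-based indexing.
    """
    regions = []
    for site in sorted(set(polya_sites)):
        if not 0 < site <= seq_len:
            continue  # ignore sites that fall outside the sequence
        start = max(0, site - region_len)
        regions.append((start, site))
    return regions
```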

Prioritizing Sequence Selection Regions 

• Regions for probe selection are prioritized based on the quality 
of sequence and annotation. 
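
Prioritization by quality can be sketched as a sort. The field names `seq_quality` and `annotation_quality`, and the choice to rank sequence quality ahead of annotation quality, are assumptions for illustration; the source says only that both are considered.

```python
def prioritize_regions(regions):
    """Order candidate regions from highest to lowest priority.

    regions -- list of dicts with hypothetical numeric fields
               'seq_quality' and 'annotation_quality' (higher is better)
    """
    return sorted(
        regions,
        key=lambda r: (r["seq_quality"], r["annotation_quality"]),
        reverse=True,
    )
```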



FIGURE 4



[Drawing labels: A1, A2, A3]



FIGURE 5 



[Drawing labels: Consensus Sequence; Exemplar mRNA Sequence; 3' ESTs; Other mRNA and EST Sequences; A1, A2, A3]



FIGURE 6 



[Plot: horizontal axis with ticks from 200 to 2600; sequence labels listed below]

Hs.79732.0:Hs.79732.0
Hs.79732.0:g5922008
Hs.79732.1:g4503662
Hs.79732.2:g5922006
Hs.79732.3:g5922010
Hs.900486.556:g13661192



FIGURE 7






